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■ Abstract 

^ • This work is a tutorial on Shor's factoring algorithm by means of a worked out 

example. Some basic concepts of Quantum Mechanics and quantum circuits 
P5 ■ are reviewed. It is intended for non-specialists which have basic knowledge 

on undergraduate Linear Algebra. 



1 Introduction 



In the last 30 years, the number of transistors per chip roughly doubled every 18 months, 
5^ I amounting to an exponentially growing power of classical computers. Eventually this 

statement (Moore's law) will be violated, since the transistor size will reach the limiting 
size of one atom in about 15 years. Even before that, disturbing quantum effects will 
appear. 

Ordinarily one states that if an algorithm is inefficient, one simply waits for hardware 
efficient enough to run it. If the exponential increase in the power of classical computers 
becomes saturated, the class of inefficient algorithms will remain useless. From this 
pessimistic point of view. Computer Science seems to face very narrow limitations in the 
near future, coming from the physics underneath computer architecture. 

It is important to keep in mind that a computer is a device governed by the laws of 
Physics. For decades, this fact was irrelevant. Computer Science emerged in a mathe- 
matical context and the specifics imposed by Physics were so few that most computer 
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scientists paid no attention to them. One possible explanation for this state of matters 
is that computers work under the laws of classical physics, which are common sense. 

In a seminal paper, Feynman T] argued that the way classical computers work is 
a special case of some more general form allowed by the laws of Quantum Mechanics. 
He gave general arguments supporting the idea that a manifestly quantum device would 
be exponentially faster than a classical one. Subsequently, Deutsch [2] generalized the 
classical circuit model to its quantum counterpart and gave the first example of a quantum 
algorithm faster than its classical counterpart. Based on Deutsch's work, Simon [3] 
developed a quantum algorithm exponentially faster than its classical counterpart, taking 
advantage of entanglement, corroborating with Feynman's arguments. 

The greatest success came with Shor's work 4,. He developed exponentially faster 
quantum algorithms for factoring integers and for finding discrete logarithms when com- 
pared to the known classical algorithms. Shor's algorithms allow one to render most 
current cryptographic methods useless, when a quantum computer of reasonable size is 
available. 

This work is an introductory review of Shor's factoring algorithm. We have put all 
our efforts to write as clear as possible for non-specialists. We assume familiarity with 
undergraduate Linear Algebra, which is the main mathematical basis of Quantum Me- 
chanics. Some previous knowledge of Quantum Mechanics is welcome, but not necessary 
for a persistent reader. The reader can find further material in El 13 El • 

Section 2 reviews basic notions of Quantum Mechanics necessary for Quantum Com- 
putation. Section 3 introduces the notion of quantum circuits and presents some basic 
examples. Section 4 describes how factorization can be reduced to order calculation and 
Section 5 gives a quantum algorithm for it. Section 6 shows the quantum Fourier trans- 
form. Section 7 gives an example and finally Section 8 shows the decomposition of the 
Fourier transform circuit in terms of the universal gates. 



Review of Quantum Mechanics for Quantum Computa- 
tion 



In classical computers, a bit can assume only values or 1. In quantum computers, the 
values and 1 are replaced by the vectors 1 0) and 1 1) . This notation for vectors is called 
the Dirac notation and is standard in Quantum Mechanics. The name bit is replaced by 
qubit, short of quantum bit. The difference between bits and qubits is that a qubit \tp) 
can also be in a linear combination of the vectors |0) and 



a\0)+P\l) 



(1) 



where a and /? are complex numbers, {ip) is said to be a superposition of the vectors |0) 
and |1) with amplitudes a and (3. Thus, {ip) is a vector in a two-dimensional complex 
vector space, where {|0), |1)} forms an orthonormal basis, called the computational basis 
(see Fig. ^in the real case). The state |0) is not the zero vector, but simply the first 
vector of the basis. The matrix representations of the vectors |0) and |1) are given by 



|0) 



1 




and |1) 




1 
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In Quantum Mechanics, vectors are systematically called states. We use this term from 
now on. 

The physical interpretation of {ip) is that it coexists in two states: |0) and |1). It 
is similar to a coin that is partially heads up and partially tails up simultaneously. We 
cannot push further the analogy simply because quantum phenomena do not have a 
classical analogue in general. The state {ip) can store a huge quantity of information 
in its coefficients a and /3, but this information lives in the quantum level, which is 
microscopic (usually quantum effects appear in atomic dimensions). To bring quantum 
information to the classical level, one must measure the qubit. Quantum Mechanics tells 
us that the measurement process inevitably disturbs a qubit state, producing a non- 
deterministic collapse of \ip) to either |0) or One gets |0) with probability |ap or |1) 
with probability |/3p. The non-deterministic collapse does not allow one to determine 
the values of a and P before the measurement. They are inaccessible via measurements 
unless one has many copies of the same state. Two successive measurements of the same 
qubit give the same output. If |ap and are probabilities and there are only two 
possible outputs, then 

|a|2 + |/3|2 = l. (2) 

Calculating the norm of Eq. ((2} gives 

|||V,)|| = V|aP + |/3|2 = l. 

A measurement is not the only way that one can interact with a qubit. If one does not 
obtain any information about the state of the qubit, the interaction changes the values 
of a and /? keeping the constraint (jSJ . The most general transformation of this kind is a 
linear transformation U that takes unit vectors into unit vectors. Such a transformation 
is called unitary and can be defined by 

u^u = UU^ = I, 

where = (U*)'^ (* indicates complex conjugation and T indicates the transpose oper- 
ation) and / is the 2x2 identity matrix. 

So far we are dealing with one-qubit quantum computers. To consider the multiple 
qubit case, it is necessary to introduce the concept of tensor product. Suppose V and 
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Figure 1: Computational basis for the case a, /3 real. In the general case (a, /3 
complex) there is still a geometrical representation called the Bloch sphere (see 
page 15). 
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W are complex vector spaces of dimensions m and n, respectively. The tensor product 
y^VF is an mn-dimensional vector space. The elements of y^PF are linear combinations 
of tensor products \v) ® {w), satisfying the following properties (z G C, \v), \vi), \v2) £ V, 
and \w), {wi), \'W2) € W): 

1. z{\v) (8) \w)) = {z\v)) (8) \w) = \v) (8) (zlw)), 

2. (|fl) + \V2)) \w) = {\vi) (g) \w)) + {\V2) (8 \w)), 

3. \v) (8 {\wi) + \W2)) = {\v) (8 \wi)) + {\v) (g) \W2)). 

We use also the notations \v)\w), \v,w) or \vw) for the tensor product \v) \w). Note 
that the tensor product is non-commutative, so the notation must preserve the ordering. 

Given two linear operators A and B defined on the vector spaces V and W, respec- 
tively, we can define the linear operator AfgBonVfgWas 



{A (g) B){\v) <S)\w)) = A\v) (8 B\w), 
where \v) G V and \w) G W. The matrix representation Afg B is given by 

AuB ■ ■ ■ AimB 



(3) 



A^B 



AmlB 



A R 



(4) 



where ^ is an m x m matrix and is a n x n matrix (we are using the same notation 
for the operator and its matrix representation). So, the matrix A® B has dimension 
mn X ran. For example, given 



A 



the tensor product A® B \?, 



1 

1 



and B 



A®B 



1 

1 



1 
1 
1 



1 
1 
1 



1 

1 

1 

1 
1 
1 



The formula (@J) can also be used for non-square matrices, such as the tensor product of 
two vectors. For example, if we have a 2-qubit quantum computer and the first qubit is 
in the state |0) and the second is in the state |1), then the quantum computer is in the 
state |0) (8 |1), given by 



|0) ® |1) 



101) 



" 1 ■ 




' ■ 




1 







1 












(5) 
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The resulting vector is in a 4-dimensional vector space. 

The general state |^) of a 2-qubit quantum computer is a superposition of the states 
|00), |01), |10), and |11), 

IV-) = a|00)+/3|01)+7|10)+5|ll), (6) 

with the constraint 

I«|' + |/3|' + |7P + |5P = 1- 

Regarding the zeroes and ones as constituting the binary expansion of an integer, we can 
replace the representations of states 

|00), |01), |10), 111), 

by the shorter forms 

|0), |1), |2), |3), 

in decimal notation, which is handy in some formulas. 

In general, the state \ip) of an n-qubit quantum computer is a superposition of the 
2" states |0), |1), |2'*- 1), 

2"-l 
i=0 

with amplitudes aj constrained to 

2"-l 
i=0 

Recall that the orthonormal basis {|0) , . . . , |2" — 1)} is the computational basis in decimal 
notation. The state of an n-qubit quantum computer is a vector in a 2"-dimensional 
complex vector space. When the number of qubits increases linearly, the dimension of 
the associated vector space increases exponentially. As before, a measurement of a generic 
state yields the result \io) with probability |aigp, where < io < 2"". Usually, the 
measurement is performed qubit by qubit yielding zeroes or ones that are read together 
to form iQ. We stress again a very important feature of the measurement process. The 
state 1^) as it is before measurement is inaccessible unless it is in the computational basis. 
The measurement process inevitably disturbs forcing it to collapse to one vector of 
the computational basis. This collapse is non-deterministic, with the probabilities given 
by the squared norms of the corresponding amplitudes in |^). 

If we have a 2-qubit quantum computer, the first qubit in the state 

\ip)=a\0)+b\l) 

and the second in the state 

|V') = c|0)+d|l), 
then the state of the quantum computer is the tensor product 

= (a|0)+6|l))®(c|0)+d|l)) (7) 
= oc|00) + ad\01) + bc\10) + bd\ll). 
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Note that a general 2-qubit state ^ is of the form if and only if 



a 
P 
7 
6 



ac, 
ad, 
be, 
bd. 



From these equalities we have that a general 2-qubit state (jHl is of the form ((7j) if and 
only if 

a6 = (3^. 

Thus, the general 2-qubit state is not necessarily a product of two one-qubit states. 
Such non-product states of two or more qubits are called entangled states, for example, 
(|00) + |ll))/\/2- The entangled states play an essential role in quantum computation. 
Quantum computers that do not use entanglement cannot be exponentially faster than 
classical computers. On the other hand, a naive use of entanglement does not guarantee 
any improvements. 

A complex vector space y is a Hilbert space if there is an inner product, written in 
the form {ip\ip), defined by the following rules (a, 6 G C and \ip), \u), \v) G V): 

1. = {^\i,Y , 

2. {^\{a\u) + b\v))) = a{^\u) + b{ip\v), 

3. {if\ip) > if \^) + 0. 

The norm of a vector is given by 



II IV') II = 

The notation (99] is used for the dual vector to the vector |(^). The dual is a linear 
operator from the vector space V to the complex numbers, defined by 

{^\{\v)) = {^\v), V|t;)Gy. 

Given two vectors \i~p) and in a vector space V , there is also an outer product 
\'\\))(lp\, defined as a linear operator on V satisfying 



If |(^) = a|0) and 
and outer products are: 



= c|0) + d\l), then the matrix representations for inner 



[a* b* ] 



a 
b 



c 
d 

d* 



a*c + b*d, 

ac* ad* 
be* bd* 



The matrix of the outer product is obtained by usual matrix multiplication of a column 
matrix by a row matrix. But in this case, we can replace the matrix multiplication by 
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IV'l) 




Figure 2: The sketch of the quantum computer. We consider the input nonentan- 
gled, which is reasonable in generaL On the other hand, the output is entangled 
in general. The measurement of the state {ip), not shown here, returns zeroes 
and ones. 



the tensor product, i.e., \ f){'(p\ = \(p)'Si{'4'\ (notice the complex conjugation in the process 
of taking the dual). 

After the above review, we are ready to outline the quantum computer. In Fig. [21 
we are taking a nonentangled input, what is quite reasonable. In fact, IV'j) is either |0) 
or |1) generally. \^), on the right hand side of Fig. [21 is the result of the application 
of a unitary operator U on the input. The last step is the measurement of the states 
of each qubit, which returns zeroes and ones that form the final result of the quantum 
calculation. Note that there is, in principle, an infinite number of possible operators U, 
which are unitary 2" x 2" matrices. 



3 Quantum Circuits 

Let us start with one-qubit gates. In the classical case there is only one possibility, which 
is the NOT gate. The NOT gate inverts the bit value: goes to 1 and vice-versa. The 
straightforward generalization to the quantum case is given in Fig. [31 where X is the 
unitary operator 



So, if the input \^) is |0), the output is |1) and vice-versa. But now we can have a 
situation with no classical counterpart. The state can be a superposition of states |0) 
and The general case is = a\0)+P\l) and the corresponding output is a|l) -|-/?|0). 

The gate X is not the only one-qubit gate. There are infinitely many, since there are 
an infinite number of 2 x 2 unitary matrices. In principle, any unitary operation can be 



X 



x|v> 



Figure 3: Quantum NOT gate. 
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implemented in practice. The Hadamard gate is another important one-qubit gate, given 

by 



H 



1 



1 



It is easy to see that 



H\0) = 
H\l) = 



V2[l -1 , 

|o) + |i; 



V2 

|o> - |i) 



V2 



If the input is |0), the Hadamard gate creates a superposition of states with equal weights. 
This is a general feature, valid for two or more qubits. Let us analyze the 2-qubit case. 
The first example of a 2-qubit gate is (g) iJ: 



i?®2|o)|o) 



{H H)i\0) |0)) = H\0) H\0) 



|0) + |i) 
V2 



|0) + |1) 
V2 



= -(|0)|0) + |0)|1) + |1)|0) + |1)|1)) 

= ^(|0) + |1) + |2) + |3)). 

The result is a superposition of all basis states with equal weights. More generally, the 
Hadamard operator applied to the n-qubit state |0) is 



ii'®"|0,...,0) = iH\0)Y 

|0) + |1) ^~ 
V2 



2"-l 
'on ' ' 



(8) 



Thus, the tensor product of n Hadamard operators produces an equally weighted super- 
position of all computational basis states, when the input is the state |0). This state is 
useful for applying quantum parallelism, as we will sec ahead. 

Another important 2-qubit quantum gate is the CNOT gate. It has two input qubits, 
the control and the target qubit, respectively. The target qubit is flipped only if the 
control qubit is set to 1, that is. 



|00) 
|01) 
1 10) 

111) 



|00), 

|oi), 

111), 

|10). 



(9) 



The action of the CNOT gate can also be represented by 

\a, b) — > |a, a © 6), 
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Figure 4: CNOT gate. \i) can be either |0) or |1). The general case is obtained 
by hnearity. 



where © is addition modulo 2. Now, let us obtain its matrix representation. Performing 
the same calculations that yield Eq. we have 



" 1 ■ 




" " 




" " 




■ " 






, |oi) = 


1 




, |io) = 




1 


, lll) = 





















1 



1 00) 



Thus, from (jUJ and (|10|) . the matrix representation f/cNOT of the CNOT gate is 



(10) 



1 

1 



1 



Fig. |1] describes the CNOT gate, where \i) is either |0) or |1). The figure could 
lead one to think that the output is always nonentangled, but that is not true, since if 
the first qubit is in a more general state given by a|0) + then the output will be 
a 1 0) I o") + 6 |1) X I cr), which is entangled in general. The input can be entangled as well. 

We have seen two examples of 2-qubit gates. The general case is a 4 x 4 unitary 
matrix. Gates that are the direct product of other gates, such as H ® H, do not produce 
entanglement. If the input is nonentangled, the output is not too. On the other hand, 
the output of the CNOT gate can be entangled while the input is nonentangled. 

CNOT and one-qubit gates form a universal set of gates. This means that any other 
gate, operating on 2 or more qubits, can be written as compositions and direct products 
of CNOT and one-qubit gates 0. 

At the end of Section 2, we gave a general outline of the quantum computer (Fig. 2) 
based on the action of a unitary operator U . In the present section, we have seen that 
in general U can be broken up in terms of smaller gates. This decomposition is useful 
because it corresponds to the natural steps that describe an algorithm. So, a quantum 
algorithm consists of a sequence of unitary operators acting on sets of qubits. These 
unitary operators multiplied together form the operator U of Fig. 2. 

More details on quantum circuits can be found in |H1 |H] . 

4 Factorization can be reduced to order calculation 



Let us describe Shor's algorithm for finding the prime factors of a composite number 
N . Think of a large number such as one with 300 digits in decimal notation, since such 
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numbers are used in cryptography. Though N is large, the number of qubits necessary 
to store it is smah. In general log2 is not an integer, so let us define 

n= \l0g2N]. 

A quantum computer with n qubits can store or any other positive integer less than 
N. With a little thought, we see that the number of prime factors of N is at most n. If 
both the number of qubits and the number of factors are less than or equal to n, then 
it is natural to ask if there is an algorithm that factors in a number of steps which is 
polynomial in n. More technically, the question is: is there a factorization algorithm in 
the complexity class V 10 ? 

Reduction of factorization of N to the problem of finding the order of an integer x 
less than N is as follows. If x and N have common factors, then GCD{x, N) gives a 
factor of N, therefore it suffices to investigate the case when x is coprime to N. The 
order of x modulo N is defined as the least positive integer r such that 

x^ = 1 mod N. 

If r is even, we can define y by 

j;'"/^ = y mod A^. 

The above notation means that y is the remainder of x^/^ divided by A^ and, by definition, 
< y < N. Note that y satisfies = 1 modulo A^, or equivalently (y — l)(y + 1) = 
modulo A^, which means that A^ divides (y — l){y + 1). If 1 < y < A^ — 1, the factors 
y — 1 and y + 1 satisfy 0<y — l<y + l<A^, therefore A^ cannot divide y — 1 nor y + 1 
separately. The only alternative is that both y — 1 and y + 1 have factors of A^ (that yield 
A'^ by multiplication). So, GCD(y — 1, A^) and GCD(y + 1, A^) yield non trivial factors of 
A'^ (GCD stands for the greatest common divisor). If A^ has remaining factors, they can 
be calculated applying the algorithm recursively. 

Consider A^ = 21 as an example. The sequence of equivalences 

16 mod 21 
11 mod 21 
11 X 2 = 1 mod 21 

show that the order of 2 modulo 21 is r = 6. Therefore, y = 2'^ = 8 modulo 21. y — 1 
yields the factor 7 and y + 1 yields the factor 3 of 21. 

In summary, if we pick up at random a positive integer x less than A^ and calculate 
GCD(x, A^), either we have a factor of A^ or we learn that x is coprime to A^. In the latter 
case, if x satisfies the conditions (1) its order r is even, and (2)0<y — l<y + l<A^, 
then GCD(y - 1, A^) and GCD(y + 1, A^) yield factors of A^. If one of the conditions 
is not true, we start over until finding a proper candidate x. The method would not 
be useful if these assumptions were too restrictive, but fortunately that is not the case. 
The method sistematically fails if A^ is a power of some odd prime, but an alternative 
efficient classical algorithm for this case is known. If A'" is even, we can keep dividing by 
2 until the result turns out to be odd. It remains to apply the method for odd composite 
integers that are not a power of some prime number. It is cumbersome to prove that the 
probability of finding x coprime to A^ satisfying the conditions (1) and (2) is high; in fact 



2^ = 

25 ^ 

26 ^ 
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this probability is 1 — 1/2*^^^, where k is the number of prime factors of A^. In the worst 
case (A'" has 2 factors), the probabihty is greater than or equal to 1/2 (see the proof in 
Appendix B of [HI). 

At first sight, it seems that we have just described an efficient algorithm to find a 
factor of N. That is not true, since it is not known an efficient classical algorithm to 
calculate the order of an integer x modulo N. On the other hand, there is (after Shor's 
work) an efficient quantum algorithm. Let us describe it. 



5 Quantum algorithm to calculate the order 

Consider the circuit of Fig. |S1 It calculates the order r of the positive integer x less than 
N, coprime to A^. Vx is the unitary linear operator 

VA\j)\k)) = \j)\k + x^), (11) 

where \j) and \k) are the states of the first and second registers, respectively. The 
arithmetical operations are performed modulo A'^, so < A; + x-' < N. DFT is the 
Discrete Fourier Transform operator which will be described ahead. 

The first register has t qubits, where t is generally chosen such that A^^ < 2* < 2A^^, 
for reasons that will become clear later on [^. As an exception, if the order r is a power 
of 2, then it is enough to take t = n. In this section we consider this very special case 
and leave the general case for Section [3 We will keep the variable t in order to generalize 
the discussion later on. 

The states of the quantum computer are indicated by |^o) to iV's) in Fig. |S1 The 
initial state is 

iV'o) = |o_^|o_^. 

t n 



first register 
(t qubits) 



second register 
(n qubits) 




I t ' ' II II 

IV'o) IV'i) 1^-2) IV-a) IV'4) \i>5) 



Figure 5: Quantum circuit for finding the order of the positive integer x 
modulo N. 
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The application of the Hadamard operator 



H 



1 

V2 



on each qubit of the first register yields (see Eq. 



2*-l 



^ i=o 



(12) 



The first register is then in a superposition of all states of the computational basis with 
equal amplitudes given by Now we call the reader's attention to what happens when 
we apply Vx to IV'i): 



Vxli^i) 

j=0 

2*-l 



7^ E b-) 



(13) 



j=0 



The state 1-02) is a remarkable one. Because Vx is linear, it acts on all |j) |0) for 2* values 
of j, so this generates all powers of x simultaneously. This feature is called quantum 
parallelism. Some of these powers are 1, which correspond to the states |0) |1), |r) |1), 



|2r)|l) 



1 r 



1). This explains the choice Hll|) for Vx- Classically, one would 
, for j starting from 2 until reaching j = r. Quantumly, one 



calculate successively 

can calculate all powers of x with just one application of T4- At the quantum level, the 
values of j that yield x^ = 1 modulo are "known" . But this quantum information is 
not fully available at the classical level. A classical information of a quantum state is 
obtained by practical measurements and, at this point, it does not help if we measure 
the first register, since all states in the superposition ()13() have equal amplitudes. The 
first part of the strategy to find r is to observe that the first register of the states |0) |1), 
|r) |1), |2r) |1), |2* — r^\l) is periodic. So the information we want is a period. In 
order to simplify the calculation, let us measure the second register. Before doing this, 
we will rewrite \tp2) collecting equal terms in the second register. Since x^ is a periodic 
function with period r, substitute ar + b for j in Eq. H13() . where < a < (2*/t) — 1 and 
< 6 < r — 1. Recall that we are supposing that t = n and r is a power of 2, therefore r 
divides 2*. Eq. l\l'S\i is converted to 



\tp2 



^ r—1 I - 

TP ^ 

6=0 



\ar + b) 



a=0 



(14) 



In the second register, we have substituted x^ for x"^"*"^, since x*" = 1 modulo A^. Now 



the second register is measured. Any output x^ , x^. 



X 



r-l 



can be obtained with equal 
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probability. Suppose that the result is x''" . The state of the quantum computer is now 



(15) 



Note that after the measurement, the constant is renormalized to since there are 

2*/r terms in the sum p5() . Fig. |^ shows the probability of obtaining the states of the 
computational basis upon measuring the first register. The probabilities form a periodic 
function with period r. Their values are zero except for the states |6o), \r + bo), \2r + bo), 
|2*-r + 6o>- 

How can one find out the period of a function efficiently? The answer is in the 
Fourier transform. The Fourier transform of a periodic function with period r is a new 
periodic function with period proportional to 1/r. This makes a difference for finding r. 
The Fourier transform is the second and last part of the strategy. The whole method 
relies on an efficient quantum algorithm for calculating the Fourier transform, which is 
not available classically. In Section |S1 we show that the Fourier transform is calculated 
efficiently in a quantum computer. 



6 The quantum discrete Fourier transform 

— 1} — > C is a new function F : 



The Fourier transform of the function F : {0, 
{0, . . . , - 1} ^ C defined as 



N-l 



-y 



(16) 



We can apply the Fourier transform either on a function or on the states of the compu- 
tational basis. The Fourier transform applied to the state \k) of the computational basis 



Probability distribution 




r + bo 



2r + 6o 



3r + bo 



Terms of It/^s) 
(1st register) 



Figure 6: Probability distribution of \i]^^) measured in the computational basis 
(for the case 6o = 3 and r = 8). The horizontal axis has 2* points. The number 
of peaks is 2*/r and the period is r. 
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{|0),...,|iV-l)} is 



DFT(|A:)) = 



N-l 



(17) 



i=o 



where the set {\ipk) ■ k = 0, ... — 1} forms a new orthonormal basis. The Fourier 
transform is a unitary hnear operator. So, if we know how it acts on the states of the 
computational basis, we also know how it acts on a generic state 



N~l 



a=0 



The Fourier transform of can be performed indistinctly using either H16|l or (|17|) . We 
will use the latter. 

To prove that {\ipk) : k = 0, . . . , N — 1} is an orthonormal basis, i.e., 



(V'fc'IV'fc 



s. 



k'k, 



we can use the identity 
1 

N 



N-l 
j=0 



^2-Kijk/N 



1 if A; is a multiple of N 
otherwise, 



(18) 



which is useful in the context of Fourier transforms. It is easy to verify that (|18)) is true. 
If fc is a multiple of A^, then q'^'^v^/n _ and the first case of the identity follows. If k is 
not a multiple of N ^ (|18|) is true even if is not a power of 2. Fig. d shows each term 
^2mjk/N _ ...J 6) for the case k = 1 and = 7, as vectors in the complex plane. 
Note that the sum of vectors must be zero by a symmetry argument: the distribution 
of vectors is isotropic. Usually it is said that the interference is destructive in this case. 
Using this identity, we can define the inverse Fourier transform, which is similar to (|17|) . 
just with a minus sign on the exponent. Note that DFT"^ = DFT^ since DFT is a 
unitary operator. 

We will present the details of a quantum circuit to perform the Fourier transform in 
Section |H1 Now we will continue the calculation process of the circuit of Fig. El We are 
ready to find out the next state of the quantum computer — |V'4)- Applying the inverse 
Fourier transform on the first register, using Eq. ()17|) and the linearity of DFT^^, we 
obtain 



IV'4 



DFTtdV's)) 



2*-l 



a=0 \ ^ ^ j=0 

Inverting the summation order, we have 



\ r 



( 2*-l 

E 



^ — 1 
— V 



— 27Tija 



a=0 



,-27riifeo/2* 



\J) 



(19) 
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Using ()18() . we see that the expression in square brackets is not zero if and only if 
j = k2*/r, with k = 0, ...,r — 1. When j takes such values, the expression in the square 
brackets is equal to 1. So we have 



1 




X 



bo 



(20) 



In order to find r, the expression for has two advantages over the expression for 
lips) (Eq. (|15)) ): r is in the denominator of the ket label and the random parameter bo 
moved from the ket label to the exponent occupying now a harmless place. Fig. [H] shows 
the probability distribution of measured in the computational basis. Measuring the 
first register, we get the value k()2^/r, where kQ can be any number between and r — 1 
with equal probability (the peaks in Fig. IHJ. If we obtain ko = 0, we have no clue at 



Probability distribution 



2i 
r 



_l 1 1 ^ 

2^ 22i 32^ Terms of |^4> 

''' r r ^-[^g^ register) 



Figure 8: Probability distribution of \^4) measured in the computational basis. 
The horizontal axis has 2* points, only the non-null terms are shown. The number 
of peaks is r and the period is 2*/^. 
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all about r, and the algorithm must be run again. If kQ ^ 0, we divide ko2''/r by 2*, 
obtaining ko/r. Neither ko nor r are known. If ko is coprime to r, we simply select the 
denominator. 

If kQ and r have a common factor, the denominator of the reduced fraction ko/r is 
a factor of r but not r itself. Suppose that the denominator is ri. Let r = rir2- Now 
the goal is to find r2, which is the order of x*"^. We run again the quantum part of 
the algorithm to find the order of x*"^. If we find r2 in the first round, the algorithm 
halts, otherwise we apply it recursively. The recursive process does not last, because the 
number of iterations is less than or equal to log2 r. 

Take = 15 as an example, which is the least nontrivial composite number. The set 
of numbers less than 15, coprime to 15 is {1, 2, 4, 7, 8, 11, 13, 14}. The numbers in the set 
{4, 11, 14} have order 2 and the numbers in the set {2, 7, 8, 13} have order 4. Therefore, 
in any case r is a power of 2 and the factors of = 15 can be found in a 8-bit quantum 
computer {t+n = 2[log2 15] = 8). The authors of jJJ used a 7-qubit quantum computer, 
bypassing part of the algorithm. 

7 Generalization by means of an example 

In the previous sections, we have considered a special case when the order r is a power of 2 
and t = n (t is the number of qubits in the first register — see Fig. E] — and n = [log2 -^1 )• 
In this section, we consider the factorization of = 21, that is the next nontrivial 
composite number. We must choose t such that 2* is between N'^ and 2A^^, which is 
always possible j^. For N = 21, the smallest value of t is 9. This is the simplest example 
allowed by the constraints, but enough to display all properties of Shor's algorithm. 

The first step is to pick up x at random such that 1 < x < N , and to test whether x 
is coprime to N. If not, we easily find a factor of N by calculating GCD(x, A^). If yes, 
the quantum part of the algorithm starts. Suppose that x = 2 has been chosen. The 
goal is to find out that the order of x is r = 6. The quantum computer is initialized in 
the state 



where the first register has t = 9 qubits and the second has n = 5 qubits. Next step is 
the application of H^^ on the first register yielding (see Eq. ()12pl 



lV'o) = |0)|0), 




The next step is the application of Vx (defined in (fTT|) ). which yields 





|0)|1) + |1)|2) + |2)|4) + |3)|8)+ |4)|16)+ |5) |11) + 



|6) |1) + |7) |2) + |8) |4) + |9) |8) + |10) |16) + |11) |11) + 



12)|1) + ... 
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Notice that the above expression has the following pattern: the states of the second 
register of each "column" are the same. Therefore we can rearrange the terms in order 
to collect the second register: 



I -02 



1 



/5I2 



|0)+ |6) + |12) + 

(|1)+ |7) + |13) + 
(|2)+ |8) + |14) + 
(|3)+ |9) + |15) + 
(|4) + |10) + |16) + 

(|5)+ |11) + |17) + 



+ |504) + |510)) |1) + 

+ |505) + |511))|2) + 
+ |506)) |4) + 
+ |507)) |8) + 
+ |508)) |16) + 

+ |509)) 111) 



(21) 



This feature was made explicit in Eq. dJ. Because the order is not a power of 2, 
here there is a small difference: the first two lines of Eq. H21|) have 86 terms, while the 
remaining ones have 85. 

Now one measures the second register^, yielding one of the following numbers equiprob- 
ably: {1,2,4,8, 16, 11}. Suppose that the result of the measurement is 2, then 



IV'3 



1 



(|1) + |7) + |13) + . . . + |505) + |511)) |2) 



(22) 



Notice that the state iV'a) was renormalized in order to have unit norm. It does not 
matter what is the result of the measurement; what matters is the periodic pattern of 
1)221) . The period of the states of the first register is the solution to the problem and the 
Fourier transform can reveal the value of the period. So, the next step is the application 
of the inverse Fourier transform on the first register of {ips): 



IV'4 



DFTtdV^s)) 

DFTt ( ^y"|6a + 1) 1 12 

^ 511 / r ^ 85 



j=0 



a=0 



|J> |2>, 



(23) 



where we have used Eq. H17|) and have rearranged the sums. The last equation is similar 
to Eq. ()19)) . but with an important difference. In Sectional we were assuming that r 
divides 2*. This is not true in the present example (6 does not divide 512), therefore 
we cannot use the identity (|18)) to simplify the term in brackets in Eq. ()23j) . This term 
never vanishes, but its main contribution is still around j = 0, 85, 171, 256, 341, 427, 
which are obtained rounding 512A;o/6 for ko from to 5 — compare to the discussion that 
follows Eq. ((201) • To see this, let us plot the probability of getting the result j (in the 
interval to 511) by measuring the first register of the state |V'4)- From (|23() . we have 



^As measurements can always be performed in the end (see 
It is commonly used to simplify the expressions that follow. 



page 186), this step is not necessary. 
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that the probabihty is 



Prob(j) 



512 X 86 



85 



e ""'512 

a=0 



(24) 



The plot of Prob(j) is shown in Fig. ^ We see the peaks around j = 0, 85, 171, 256, 
341, 427, indicating a high probabihty of getting one of these values, or some value very 
close to them. In between, the probability is almost zero. The sharpness of the peaks 
depends on t (number of qubits in the first register). The lower limit 2* > N'^ ensures 
a high probability in measuring a value of j carrying the desired information. A careful 
analysis of the expression ()24() is performed in ^2 smd a meticulous study of the peak 
form is performed in |13j . 

Let us analyze the possible measurement results. If we get j = (first peak), the 
algorithm has failed in this round. It must be run again. We keep x = 2 and rerun the 
quantum part of the algorithm. The probability of getting j = is low: from Eq. H24() 
we have that Prob(O) = 86/512 ~ 0.167. Now suppose we get j = 85 (or any value in 
the second peak). We divide by 512 yielding 85/512, which is a rational approximation 
of ko/G, for ko = 1. How can we obtain r from 85/512? 

The method of continued fraction approximation allows one to extract the desired 
information. A general continued fraction expansion of a rational number ji/j2 has the 
form 

ii ^ 1 



J2 ai + — 

ap 

usually represented as [oq, ai, ap], where oq is a non- negative integer and oi, Op are 
positive integers. The q-th convergent (0 < g < p) is defined as the rational number 
[ao,ai, ■■■,aq]. It is an approximation to ji/j2 and has a denominator smaller than j2. 

This method is easily applied by inversion of the fraction followed by integer division 
with rational remainder. Inverting 85/512 yields 512/85, which is equal to 6 + 2/85. We 
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Figure 9: Plot of Prob(j) against j. Compare to the plot of Fig. |H1 where peaks 
are not spread and have the same height. 
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repeat the process with 2/85 until we get numerator 1. The result is 

85 _ 1 

So, the convergents of 85/512 are 1/6, 42/253, and 85/512. We must select the con- 
vergents that have a denominator smaller than = 21 (since r < Nf. This method 
yields 1/6, and then r = 6. We check that 2^ = 1 modulo 21, and the quantum part 
of the algorithm ends with the correct answer. The order r = 6 is an even number, 
therefore GCD(2(6/2) ^ ^^21) gives two non trivial factors of 21. A straightforward cal- 
culation shows that any measured result in the second peak (say 81 < j < 89) yields the 
convergent 1/6. 

Consider now the third peak, which corresponds to ko/G, = 2. We apply again 
the method of continued fraction approximation, which yields 1/3, for any j in the third 
peak (say 167 < j < 175). In this case, we have obtained a factor of r (ri = 3), since 
2^ = 8 ^ 1 modulo 21. We run the quantum part of the algorithm again to find the order 
of 8. We eventually obtain r2 = 2, which yields r = rir2 = 3x2 = 6. 

The fourth and fifth peaks yield also factors of r. The last peak is similar to the 
second, yielding r directly. 

The general account of the succeeding probability is as follows. The area under 
all peaks is approximately the same: ~ 0.167. The first and fourth peaks have a nature 
different from the others — they are not spread. To calculate their contribution to the total 
probability, we take the basis equal to 1. The area under the second, third, fifth, and last 
peaks are calculated by adding up Prob(j), for j running around the center of each peak. 
So, in approximately 17% cases, the algorithm fails (1st peak). In approximately 33% 
cases, the algorithm returns r in the first round (2nd and 6th peaks). In approximately 
50% cases, the algorithm returns r in the second round or more (3rd, 4th, and 5th peaks). 
Now we calculate the probability of finding r in the second round. For the 3rd and 5th 
peaks, the remaining factor is r2 = 2. The graph equivalent to Fig. IHlin this case has 
2 peaks, then the algorithm returns r2 in 50% cases. For the 4th peak, the remaining 
factor is r = 3 and the algorithm returns r2 in 66.6% cases. This amounts to 2><50%+66.6% 
of 50%, which is equal to around 22%. In summary, the success probability for x = 2 is 
around 55%. 



8 Fourier transform in terms of the universal gates 

In the previous section, we have shown that Shor's algorithm is an efficient probabilistic 
algorithm, assuming that the Fourier transform could be implemented efficiently. In this 
section, we decompose the Fourier transform in terms of the universal gates: CNOT 
and 1-qubit gates. This decomposition allows one to measure the efficiency of the quan- 
tum discrete Fourier transform and shows how to implement it in an actual quantum 
computer. 

^The inequality r < ^{N) follows from the Euler's theorem: x*''-^' = 1 mod A'', where a; is a positive 
integer coprime to and (p is the Euler's totient function (ifiiN) gives the number of positive integers 
less than A'', coprime to A'^). The inequality ^p{N) < N follows from the definition of ip. (see (14) page 
492) 
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The Fourier transform of the states of the computational basis is 



DFT(|j)) = ^^e2-^-'=/^|A:). (25) 



Noting that the right hand side of Eq. (|25|) has N terms and the computational basis has 
N states, we derive that the complexity to calculate classically the Fourier transform of 
the computational basis using Eq. (|25j) is 0{N'^) = 0(2^"') - double exponential growth. 
A very important result in Computer Science was the development of the classical fast 
Fourier transform (EFT), which reduced the complexity to 0(n2") ^^1- In the present 
context we show the improvement by recognizing that the rhs of (|25j) is a very special 
kind of expansion, which can be fully factored. For example, the Fourier transform of 
{|0), |1), 1 2), 1 3)} can be written as 

|0) + |1)\ ^ /|0) + |1 



\/2 y V \/2 

|o)-|l)^ ^ ^io>+^ii 



\/2 y V \/2 

|0)-|1)\ ^ /|0)-i|l 



DFT(|0)) = 

DFT(|1)) = 

DFT(,2)) ^ (M^) ^ (MHl) , 

DFT(|3)) = 



\/2 y V \/2 

Note that in example (|26() . we are using base 2 in order to factor the rhs. Let us now 
factor the general expression. The first step is to write (|25)) in the form 



DFT(li)) = ^ E • • • E ^ . . . \kn) , (27) 

^ fci=0 k„=0 

where the ket \k) was converted to base 2 and we have used the expansion k = Yl^=i 

in the exponent. Using that the exponential of a sum is a product of exponentials, H27() 

turns into a (non-commutative) product of the following kets: 

DFT(lj)) = E • • • E n (^'^^'^ 1^')) • (28) 
Now we factor ()28|) by interchanging the sums and the product: 

DFT(|j)) = -^nE^'^^'^l^') • (29) 

We easily convince ourselves that the last equation is correct by going backwards: simply 
expand the product in Eq. ()29() and then put all sums at the beginning of the resulting 
expression to obtain (|28|) . Expanding the sum of Eq. (|29|) and then the product, we 
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finally get 
DFT(|j)) 



,27rii/2' M\ 



1=1 



V2 



V2 



). . .(g) 



|0) + e2^^j^ |1) 
71 



(30) 



The complexity to calculate Eq. ()3Up for one |j) is 0{n), since there are n terms in 
the product. The complexity in the classical calculation of the fast Fourier transform 
of the whole computational basis is still exponential - 0{n2"'), since the calculation is 
performed on each of the 2" basis elements, one at a time. On the other hand, the 
quantum computer uses quantum parallelism, and the Fourier transform of the state 

2"-l 



a=0 

that has an exponential number of terms, is calculated with one application of the quan- 
tum Fourier transform. The Fourier transform of the 2" basis elements is performed 
simultaneously, so the complexity of the quantum Fourier transform is measured by the 
size of its circuit. We now show that it requires O(ra^) gates. 

Consider the circuit of Fig. EH It is easy to check that the value of the qubits \jm), 
m ^ I, does not change. Let us now check the hard one: \ji). The unitary matrices 
are defined as 

' 1 

exp (27ri^) 

Each Rk gate is controlled by the qubit \jk+i-i)- So, if jk+i-i = 0, then Rk must be 
replaced by the identity matrix (no action), and if j^+i-i = 1, then i?^ comes in action. 
This means that, for calculation purposes, the RkS controlled by can be replaced 

by the 1-qubit gates 

" 1 



Rh 



CRk 



exp 



2vri^-^ 



(31) 



— H — R2 

b^+i) 



Rn + l~l — \'4>) 



-i- 



\jn) 



Figure 10: Part of the quantum Fourier transform circuit that acts on qubit 
\ji). The value of all qubits does not change, except \ji) that changes to |^) = 

|0)+e^"'2>» + i-' |i) 
%/2 
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In order to simplify the calculations, note that 

|0) + e2-^f |l) 



H\ji) 



V2 



CRi\+) 



where |+) = -^(lO) + |1)). So instead of using 

IV^) = CRn+l-l...CR2H\ji), 

which can be read directly from Fig. IIUI we will use 

= CRn+i-i . . . CR2CRi\+) . 

We define 

1 

PRn+l^l = n ^^fc' 
k=n+l-l 

where the product is in the reverse order. Using H31() and ^i'6\\ . we get 



PR 



n+l-l 



1 

1 

71 



1 







exp27^^^ 2MTrT 

1 

exp (27ri ^„_j^i_i 



+ ...+ 



(32) 



(33) 



(34) 



where we have used that j = '}2^=i jm^^ and the fact that the first / — 1 terms of this 
expansion do not contribute — they are integer multiples of 27ri in (|34j) . We finally get 



PRn+l-l\+) 

^/2 



(35) 



1+) — PRr. 
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Figure 11: Intermediate circuit for the quantum Fourier Transform. The input 
is taken as |+) for calculation purposes as explained in Eq. H32|l . The output is 
in reverse order with respect to Eq. ^U^. 
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IV') 



W) 



Figure 12: The swap circuit. 



Note that Pi?„+i_; cannot be implemented directly acting only in the /-th qubit because 
it requires the values of j^+i to jn- 

The next step is the circuit of Fig. ^2 We have merged the Rk gates using Eq. . 
The gates PRk [k from n to 1) are placed in sequence in Fig. 1111 so that the output of 
the first qubit is the last term of Eq. (|nn|) . corresponding to the action of PRn on 
controlled by the other qubits, which do not change. The same process is repeated by 
PRn-i acting on |V'2)5 yielding the term before the last in Eq. (|.Sn|) . and so on, until 
reproducing all the terms of the Fourier transform. Now it remains to reverse the order 
of the states of the qubits. 

In order to reverse the states of 2 generic qubits, we use the circuit of Fig. 1121 Let 
us show why this circuit works as desired. Take the input |(/?) = |0) |1). The first 
CNOT of Fig. El does not change this state; the upside down CNOT changes to |1) |1); 
and the last CNOT changes to |1) |0). The output is \il))\ip). If we repeat the same 
process with |0) |0), |1) |0), and |1) |1), we conclude that the circuit inverts all states of 
the computational basis, therefore it inverts a generic state of the form \(p) |^). 

The decomposition is still not complete. It remains to write the controlled R^ gates 
in terms of CNOT and 1-qubit gates. This decomposition is given in Fig. 1131 The 
verification of this decomposition is straightforward. One simply follows what happens 
to the computational basis {|00) , |01) , |10) , |11)} in both circuits. 

The complete circuit for the quantum Fourier transform is given in Fig. 1141 Now we 
can calculate the complexity of the quantum Fourier circuit. Counting the number of 
elementary gates in Figs. ^1 to El we get the leading term bv? /2, which implies that the 
complexity is O(n^). 

By now one should be asking about the decomposition of Vx in terms of the elementary 
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Figure 13: Decomposition of the controlled Rk gates in terms of the universal 
gates. 
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Figure 14: The complete circuit for the quantum Fourier Transform. 



gates. Vx is the largest gate of Fig. IHI Actually, Shor stated in his 1997 paper that 
Vx is the "bottleneck of the quantum factoring algorithm" due to the time and space 
consumed to perform the modular exponentiation (see page 10). The bottleneck is 
not so strict though since, by using the well known classical method of repeated squaring 
and ordinary multiplication algorithms (see Jl] page 69), the complexity to calculate 
modular exponentiation is O(n^). The quantum circuit can be obtained from the classical 
circuit by replacing the irreversible classical gates by the reversible quantum counterpart. 

is a problem in recursive calls of the algorithm when x changes. For each x, a new 
circuit must be built, what is troublesome at the present stage of hardware development. 
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